ABSTRACT
We are interested in the following general question: is it possible to abstract knowledge generated while learning the solution to a problem, so that this abstraction accelerates the learning process? Moreover, can the acquired abstract knowledge be transferred and reused to accelerate learning in future, similar tasks? We propose a framework that conducts reinforcement learning simultaneously at two levels: an abstract policy is learned while a concrete policy for the problem is learned, and both policies are refined through exploration and the agent's interaction with the environment. We exploit abstraction both to accelerate learning of an optimal concrete policy for the current problem and to allow the generated abstract policy to be applied when learning solutions to new problems. We report experiments in a robot navigation environment showing that our framework speeds up policy construction for practical problems and generates abstractions that can be used to accelerate learning in new, similar problems.
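To make the two-level idea concrete, here is a minimal, hypothetical sketch in Python. It is not the paper's algorithm: the tabular Q-learning updates, the state-abstraction function `phi`, the fallback rule (use the abstract table whenever the concrete state is unvisited), and the toy `Corridor` environment are all illustrative assumptions. The point is only to show one way a concrete table over ground states and an abstract table over `phi(state)` can be updated from the same experience, with the abstract table seeding action choices in states the concrete learner has never seen.

```python
import random

def simultaneous_q_learning(env, phi, episodes=200, alpha=0.1, gamma=0.95, eps=0.1):
    """Two-level tabular Q-learning sketch: a concrete table keyed by
    (state, action) and an abstract table keyed by (phi(state), action).
    Both are updated from every transition the agent experiences."""
    q, q_abs = {}, {}

    def best(table, key):
        # Greedy action with random tie-breaking, defaulting unseen pairs to 0.
        vals = {a: table.get((key, a), 0.0) for a in env.actions}
        m = max(vals.values())
        return random.choice([a for a in vals if vals[a] == m])

    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            if random.random() < eps:
                a = random.choice(env.actions)
            elif any((s, b) in q for b in env.actions):
                a = best(q, s)            # concrete estimate available
            else:
                a = best(q_abs, phi(s))   # unseen state: fall back to the abstract policy
            s2, r, done = env.step(a)
            # Concrete TD update.
            tgt = r + (0.0 if done else gamma * max(q.get((s2, b), 0.0) for b in env.actions))
            q[(s, a)] = q.get((s, a), 0.0) + alpha * (tgt - q.get((s, a), 0.0))
            # Abstract TD update over phi-states, from the same transition.
            tgt_a = r + (0.0 if done else gamma * max(q_abs.get((phi(s2), b), 0.0) for b in env.actions))
            q_abs[(phi(s), a)] = q_abs.get((phi(s), a), 0.0) + alpha * (tgt_a - q_abs.get((phi(s), a), 0.0))
            s = s2
    return q, q_abs

class Corridor:
    """Toy five-cell corridor (reward 1 at the right end), used only to exercise the sketch."""
    actions = ['L', 'R']
    def reset(self):
        self.s = 0
        return self.s
    def step(self, a):
        self.s = min(self.s + 1, 4) if a == 'R' else max(self.s - 1, 0)
        return self.s, (1.0 if self.s == 4 else 0.0), self.s == 4
```

Because the abstract table is keyed by `phi(state)`, it can be carried over to a new task that shares the same abstraction, which is the transfer mechanism the abstract alludes to; the fallback rule then biases early exploration in the new task instead of acting uniformly at random.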